@aaron-ang aaron-ang commented Jan 29, 2026

Changes Made

Use the heck crate to convert columns and text between string casings.

Related Issues

Closes #2550.

@github-actions github-actions bot added the feat label Jan 29, 2026
greptile-apps bot commented Jan 29, 2026

Greptile Overview

Greptile Summary

This PR adds seven string case conversion functions (to_camel_case, to_upper_camel_case, to_snake_case, to_upper_snake_case, to_kebab_case, to_upper_kebab_case, to_title_case) to Daft, resolving issue #2550.

Implementation Details:

  • Uses the heck crate (v0.5.0) for efficient case conversions in Rust
  • Implements all functions using a clean macro pattern in case.rs that reduces code duplication
  • Follows the existing codebase pattern where lower and upper are separate functions
  • Properly integrates with Daft's three-layer API: Expression methods, standalone functions in daft.functions, and Series string namespace methods
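To make the seven conversions concrete, here is a minimal pure-Python sketch. It is not the PR's Rust code and does not reproduce heck's algorithm: the regex-based word splitting, the `_make_converter` helper, and the `CONVERTERS` table are all hypothetical, handle ASCII only, and assume the "upper" snake and kebab variants produce all-caps output. Generating every function from one table loosely mirrors how a macro can remove duplication.

```python
import re

# Assumption: a word is an acronym run, a capitalized word, or a lowercase/digit run.
# This is an ASCII-only approximation, not heck's actual word-boundary rules.
_WORD_RE = re.compile(r"[A-Z]+(?![a-z])|[A-Z][a-z0-9]*|[a-z0-9]+")


def split_words(text):
    """Split text into lowercase words, dropping separators such as '-' and '_'."""
    return [word.lower() for word in _WORD_RE.findall(text)]


def _make_converter(joiner, word_fn, first_fn=None):
    """Build one converter from a joiner and per-word transforms (hypothetical helper)."""
    def convert(text):
        if text is None:  # nulls pass through unchanged
            return None
        words = split_words(text)
        parts = [
            (first_fn or word_fn)(word) if index == 0 else word_fn(word)
            for index, word in enumerate(words)
        ]
        return joiner.join(parts)
    return convert


# One table entry per function named in the summary above.
CONVERTERS = {
    "to_camel_case": _make_converter("", str.capitalize, str.lower),
    "to_upper_camel_case": _make_converter("", str.capitalize),
    "to_snake_case": _make_converter("_", str.lower),
    "to_upper_snake_case": _make_converter("_", str.upper),   # assumed all-caps
    "to_kebab_case": _make_converter("-", str.lower),
    "to_upper_kebab_case": _make_converter("-", str.upper),   # assumed all-caps
    "to_title_case": _make_converter(" ", str.capitalize),
}

if __name__ == "__main__":
    for name, convert in CONVERTERS.items():
        print(f"{name}: {convert('helloWorld')}")
```

Edge cases such as digits, Unicode, and consecutive capitals may behave differently in heck, so the crate's own documentation remains the reference for exact output.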

Test Coverage:

  • Comprehensive parametrized tests at both DataFrame and Series levels
  • Tests cover various input formats (camelCase, kebab-case, PascalCase) and null handling
  • All 7 functions are tested with consistent expected outputs

The implementation is clean, follows existing patterns, and provides useful string manipulation functionality that complements the existing lower, upper, and capitalize functions.

Confidence Score: 5/5

  • This PR is safe to merge with no issues found
  • The implementation is well-structured, follows existing patterns consistently, has comprehensive test coverage, and uses a well-maintained external library (heck) for the core functionality
  • No files require special attention

Important Files Changed

| Filename | Overview |
|---|---|
| src/daft-functions-utf8/src/case.rs | Implemented 7 case conversion functions using heck library with a clean macro pattern |
| daft/functions/str.py | Added 7 new Python wrapper functions for case conversions |
| daft/expressions/expressions.py | Added 7 Expression methods for case conversions following existing patterns |
| tests/dataframe/test_string_case.py | Comprehensive parametrized test for all 7 case conversion functions at DataFrame level |
| tests/series/test_utf8_ops.py | Added parametrized test for case conversions at Series level with null handling |

Sequence Diagram

sequenceDiagram
    participant User
    participant DataFrame
    participant Expression
    participant Functions
    participant Rust_UDF
    participant Heck

    User->>DataFrame: df.select(col("text").to_snake_case())
    DataFrame->>Expression: to_snake_case()
    Expression->>Functions: to_snake_case(expr)
    Functions->>Rust_UDF: _call_builtin_scalar_fn("to_snake_case", expr)
    Rust_UDF->>Rust_UDF: ScalarFn::builtin(SnakeCase, vec![input])
    Rust_UDF->>Rust_UDF: call(inputs)
    Rust_UDF->>Rust_UDF: unary_utf8_evaluate()
    Rust_UDF->>Heck: to_snake_case()
    Heck-->>Rust_UDF: converted string
    Rust_UDF-->>Functions: Series result
    Functions-->>Expression: Expression result
    Expression-->>DataFrame: Expression result
    DataFrame-->>User: DataFrame with converted strings
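The call chain in the diagram can be imitated with a toy model. The sketch below is an assumption-heavy stand-in, not Daft's implementation: `REGISTRY`, the list-based `Column`, and this `Expression` class are hypothetical, and the names `unary_utf8_evaluate` and `call_builtin_scalar_fn` are borrowed from the diagram only as labels. It shows a method delegating to a standalone function, which looks up a kernel by name and applies it while preserving nulls.

```python
import re
from typing import Callable, Dict, List, Optional

# Stand-in for a UTF-8 Series: a Python list where None represents a null.
Column = List[Optional[str]]

# Hypothetical name -> kernel registry, playing the role of the builtin function lookup.
REGISTRY: Dict[str, Callable[[Column], Column]] = {}


def unary_utf8_evaluate(fn: Callable[[str], str]) -> Callable[[Column], Column]:
    """Lift a str -> str function into a column kernel that leaves nulls untouched."""
    def kernel(column: Column) -> Column:
        return [None if value is None else fn(value) for value in column]
    return kernel


def _snake(text: str) -> str:
    # Rough ASCII-only word splitting; heck's real rules may differ.
    words = re.findall(r"[A-Z]+(?![a-z])|[A-Z][a-z0-9]*|[a-z0-9]+", text)
    return "_".join(word.lower() for word in words)


REGISTRY["to_snake_case"] = unary_utf8_evaluate(_snake)


def call_builtin_scalar_fn(name: str, column: Column) -> Column:
    """Dispatch by function name, as the diagram's builtin call step suggests."""
    return REGISTRY[name](column)


class Expression:
    """Toy expression that eagerly holds its column (the real one is lazy)."""

    def __init__(self, column: Column) -> None:
        self.column = column

    def to_snake_case(self) -> "Expression":
        # The method layer simply forwards to the standalone function.
        return to_snake_case(self)


def to_snake_case(expr: Expression) -> Expression:
    """Standalone function layer, forwarding to the named builtin."""
    return Expression(call_builtin_scalar_fn("to_snake_case", expr.column))


if __name__ == "__main__":
    print(Expression(["helloWorld", None, "kebab-case"]).to_snake_case().column)
```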
@greptile-apps greptile-apps bot left a comment

5 files reviewed, no comments

codecov bot commented Jan 29, 2026

Codecov Report

❌ Patch coverage is 59.03614% with 34 lines in your changes missing coverage. Please review.
✅ Project coverage is 43.42%. Comparing base (aa8add2) to head (830742a).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/daft-functions-utf8/src/case.rs | 0.00% | 25 Missing ⚠️ |
| src/daft-functions-utf8/src/lib.rs | 0.00% | 9 Missing ⚠️ |

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #6096       +/-   ##
===========================================
- Coverage   72.91%   43.42%   -29.49%     
===========================================
  Files         973      910       -63     
  Lines      126196   112820    -13376     
===========================================
- Hits        92016    48996    -43020     
- Misses      34180    63824    +29644     

| Files with missing lines | Coverage Δ |
|---|---|
| daft/expressions/expressions.py | 95.24% <100.00%> (+0.11%) ⬆️ |
| daft/functions/__init__.py | 100.00% <ø> (ø) |
| daft/functions/str.py | 100.00% <100.00%> (ø) |
| daft/series.py | 92.14% <100.00%> (+0.16%) ⬆️ |
| src/daft-functions-utf8/src/lib.rs | 0.00% <0.00%> (-100.00%) ⬇️ |
| src/daft-functions-utf8/src/case.rs | 0.00% <0.00%> (ø) |

... and 652 files with indirect coverage changes
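The figures in this report can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted above. The helper `pct_floor` is hypothetical, and truncating to two decimals is an observation that happens to reproduce the displayed percentages, not a documented Codecov rule. The patch size of 83 lines is inferred from the 34 missing lines and the 59.03614% figure rather than stated in the report.

```python
def pct_floor(hits, lines, places=2):
    """Coverage percentage truncated (not rounded) to the given number of decimals."""
    scale = 10 ** places
    return (hits * 100 * scale // lines) / scale


# Project coverage from the diff table: Hits / Lines for base and head.
base_hits, base_misses, base_lines = 92016, 34180, 126196
head_hits, head_misses, head_lines = 48996, 63824, 112820

# Hits and misses should add up to the total line count on each side.
assert base_hits + base_misses == base_lines
assert head_hits + head_misses == head_lines

base_pct = pct_floor(base_hits, base_lines)   # 72.91
head_pct = pct_floor(head_hits, head_lines)   # 43.42
delta_pct = round(head_pct - base_pct, 2)     # -29.49

# Patch coverage: 25 + 9 = 34 missing lines at 59.03614% covered.
missing = 25 + 9
patch_lines = round(missing / (1 - 59.03614 / 100))  # inferred patch size: 83
covered = patch_lines - missing                       # 49
patch_pct = round(covered / patch_lines * 100, 5)     # 59.03614

if __name__ == "__main__":
    print(base_pct, head_pct, delta_pct)
    print(patch_lines, covered, patch_pct)
```

All 34 missing lines fall in the two Rust files, while the Python files touched by the patch report 100% patch coverage.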

Development

Successfully merging this pull request may close these issues.

string casing functions

2 participants